A Longitudinal Study of Task Performance, Head Movements, Subjective Report, Simulator Sickness, and Transformed Social Interaction in Collaborative Virtual Environments

Authors

  • Jeremy N. Bailenson
  • Nick Yee
Abstract

Empirical research on human behavior in collaborative virtual environments (CVEs) is in its infancy. Historically, one of the more valuable tools social scientists have used to evaluate new forms of media is the longitudinal study, which examines user behavior over an extended period of time. In the current study, three triads of participants came to the lab for 15 sessions over a ten-week period to collaborate for approximately 45 minutes per session. We examined nonverbal behavior, task performance on verbal tasks, and subjective ratings of presence, copresence, simulator sickness, and entitativity over time. Furthermore, we examined two types of transformed social interaction: nonverbal mimicry and facial similarity. Results demonstrated substantial differences in task performance, subjective ratings, nonverbal behavior, and simulator sickness over time as participants became familiar with the system. Furthermore, transforming avatar appearance to increase facial similarity sometimes improved task performance. We discuss implications for research on CVEs.

Longitudinal CVE Research

Introduction

Researchers in communication, computer science, and psychology are beginning to dedicate serious research efforts towards developing and understanding collaborative virtual environments (CVEs). CVEs are systems that allow geographically separated individuals to interact with one another via digital avatars. These avatars are rendered to look and behave consistently with their human counterparts, whose behaviors are measured in real time by various forms of tracking equipment. Many researchers are optimistic that CVEs will have the unique ability to overcome problems associated with traditional teleconferencing equipment (Lanier, 2001), such as transmission delays and eye-contact problems resulting from the difficulty of dynamic camera placement. Given that the study of CVEs is a relatively new endeavor, existing research on user behavior in CVEs is in its infancy.
While a number of empirical studies examine task performance and quality of experience in CVEs, all of them, without exception, examine experimental participants for whom the experiment itself is largely their first experience of a CVE. Even the few studies that have used people with CVE experience did not systematically track those users’ learning and adjustment to this extremely novel technology over time. CVE studies typically focus on how presence, copresence, and task performance are influenced by particular aspects of the CVE, such as embodiment (Benford, Bowers, Fahlen, Greenhalgh, & Snowdon, 1995), awareness (Benford et al., 1995; Dourish & Bellotti, 1992), avatar realism (Casanueva & Blake, 2000), the presence of eye contact (Bailenson, Beall, & Blascovich, 2002; Garau et al., 2003), or whether the CVE is multiscale or not (Zhang & Furnas, 2002). Others have focused on social processes in CVEs, for example, the formation of trust in persistent online virtual worlds (Schroeder & Axelsson, 2000; Yee, in press) and how immersion might confer leadership (Steed, Slater, Sadagic, Bullock, & Tromp, 1999). Also well known is the COVEN project, which explored the more technical requirements and scalability of CVEs (Normand et al., 1999).

Longitudinal research in the laboratory, that is, empirical research that tracks users over extended periods of time, is rare in behavioral research, largely because of the laborious nature of bringing the same people back to the lab over and over again. Yet such studies can be decisive. For example, until the seminal work of Terman (1916), people with high intelligence were regarded by the general populace as social outcasts who could not form healthy relationships with others. Only after Terman followed the social development of gifted children over time did he prove with longitudinal data that not all smart people are destined to spend their lives with only books as acquaintances.
There are countless instances of longitudinal data being essential to answering behavioral science questions, especially with regard to the evaluation of new media use (Huesmann, Lagerspetz, & Eron, 1984). Consequently, in the current work, we embarked upon a longitudinal study of user behavior in CVEs. We believed that longitudinal research examining CVEs would be especially important given the novelty of the equipment and the difficulty of adjusting to immersion in a single session, and we were predominantly interested in two issues. First, we sought to examine the way in which human behavior changes over time with CVE use and experience (e.g., IJsselsteijn et al., 1997). Second, we manipulated various types of transformed social interaction (TSI) and examined the effect of TSI on human behavior over time.

TSI is a research paradigm (Bailenson & Beall, in press; Bailenson, Beall, Loomis, Blascovich, & Turk, 2004) that examines the disjunction between the human characteristics and behaviors that exist in physical space and the characteristics and behaviors that are rendered to others in a CVE. Because behaviors are tracked and rendered in CVEs, as opposed to being transmitted directly via an analog-type information stream, interactants have the ability to filter, augment, or block their own behaviors from the eyes and ears of their conversational partners. Previous research on TSI has examined the effects of being able to look directly in the eye of more than one other person at once (Beall, Bailenson, Loomis, Blascovich, & Rex, 2003), mimicking the head movements of other interactants (Bailenson, Beall, Blascovich, Loomis, & Turk, 2004; Bailenson & Yee, 2005), and morphing the faces of interactants to absorb facial features of conversational opponents (Bailenson, Garland, Iyengar, & Yee, 2005).
However, TSI is often detrimental to conversational flow and interactional synchrony (Kendon, 1977), because natural movements are replaced with algorithmic ones, destroying the sensitive interplay between verbal and nonverbal cues. Consequently, we wanted to trace the adjustment to TSI algorithms over time.

In the current study, we sought to further explore the notion of using TSI specifically to instill similarity among team members. There is a wealth of research indicating that teamwork functions better when team members are “in synch” with one another. In other words, when team members are more similar to one another, work proceeds more effectively. This effect occurs when similarity is defined in terms of demographics (Kirkman, Tesluk, & Rosen, 2004) or in terms of personality (Reeves & Nass, 1996). There is ample reason to believe that TSI can be used to accelerate this process of building team familiarity, comfort, and similarity.

We manipulated two types of TSI: behavioral team similarity and visual team similarity. To accomplish behavioral similarity, we implemented a head-movement mimic algorithm. In other words, for each interactant in the CVE, the other two interactants’ avatars in the CVE mimicked his or her head movements (regardless of the actual head movement behavior of those other two interactants). In this mimic condition, all three participants saw their team members mimic them simultaneously with a four-second delay. Previous research has shown that people are more influenced by other people who mimic their language (van Baaren, Holland, Steenaert, & van Knippenberg, 2003) or their gestures (Chartrand & Bargh, 1999) than by those who do not mimic them during social interaction.
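The four-second head-movement mimic described above can be pictured as a simple delay buffer. The following Python sketch is illustrative only and is not the authors' implementation: the `HeadPose` and `DelayedMimic` names, the yaw/pitch/roll representation, and the 60 Hz sampling rate are assumptions introduced here; only the four-second lag comes from the text.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadPose:
    """Head orientation in degrees (hypothetical representation)."""
    yaw: float
    pitch: float
    roll: float


class DelayedMimic:
    """Replays one participant's own head poses after a fixed lag."""

    def __init__(self, lag_seconds: float = 4.0, sample_rate_hz: float = 60.0):
        # Number of samples that corresponds to the lag (rate is an assumption).
        self.lag_samples = int(round(lag_seconds * sample_rate_hz))
        # Keep just enough history to look back exactly lag_samples steps.
        self._history = deque(maxlen=self.lag_samples + 1)

    def update(self, pose: HeadPose) -> HeadPose:
        """Record the newest tracked pose and return the lagged pose.

        Until a full lag of history exists, the oldest recorded pose is
        returned instead.
        """
        self._history.append(pose)
        return self._history[0]


# Small demonstration with a 1-second lag sampled at 2 Hz (2 samples of lag).
mimic = DelayedMimic(lag_seconds=1.0, sample_rate_hz=2.0)
replayed_yaws = [mimic.update(HeadPose(float(i), 0.0, 0.0)).yaw for i in range(5)]
```

Under this scheme, the avatars that a participant sees would be driven by the output of `update` for that participant, rather than by their owners' tracked head movements.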
Moreover, this trend also occurs with digital computer agents: voice synthesizers that mimic vocal patterns (Suzuki, Takeuchi, Ishii, & Okada, 2003) as well as embodied agents in immersive virtual reality that mimic nonverbal behavior (Bailenson & Yee, 2005).

We also examined visual similarity. All interactants within the CVE had their avatar constructed to be a photographically realistic analog of their own head and face. Research in social psychology has demonstrated large-scale effects of similarity on social influence. An individual judged more similar to a given person (compared to a less similar individual) is considered more attractive (Berscheid & Walster, 1974; Shanteau & Nagy, 1979) and persuasive (Byrne, 1971), is more likely to receive political support (Bailenson, Garland, Iyengar, & Yee, 2005), and is more likely to elicit altruistic helping behavior in a dire situation (Dovidio, Gaertner, Anastasio, & Sanitioso, 1992). Consequently, in the visual similarity condition, for each interactant in the CVE, the digitized face of that interactant was used as the virtual face for both remaining interactants in the group of three. In other words, interactant A saw the avatars of interactants B and C as “wearing” the identical face of interactant A.

It is important to note that the notion of team similarity is very complex. Well-functioning teams have a wide range of both convergent and divergent attributes among team members. In the current work, we only examine the possibility of manipulating superficial, surface similarity among team members as a way to improve performance.

Method

Participants

Nine undergraduate students (six female, three male) from an introductory communication course participated in the study for partial course credit. None had used immersive virtual reality more than once before. None of the participants were pregnant or had epilepsy, and all were native speakers of English.
The three male subjects formed one group; the six female subjects were randomly assigned to two additional groups of three.

Design

We manipulated one independent variable, TSI condition, which had three levels: normal, face similarity, and mimic. In the normal condition, participants saw one another wearing the appropriate (i.e., their own) faces and gesturing veridically. In the face similarity condition, each participant saw the other two participants wearing his or her face but gesturing veridically. In other words, from the point of view of participant C, participants A and B gestured normally but each wore the visual face model of C. In the mimic condition, each participant saw the other two wearing their correct face models (see Figure 3), but each of them mimicked his or her head movements at a four-second lag. In other words, from the point of view of participant C, participants A and B gestured with the head movements of C, but each wore the appropriate face. It is important to note that the field of view of the head-mounted display prevented any participant from seeing both of the other participants mimicking him or her simultaneously.

The members of each group collaborated with one another for fifteen trials. The trials were spread across a ten-week academic quarter. Appendix A shows the timetable for each group; each group participated in each TSI condition five times. For each of the fifteen trials, each group received one of the three TSI conditions; in essence, the TSI manipulation was a within-trial design. This term is used (as opposed to within-subject) because we use trial, instead of subject, as a random factor in our statistical analysis.

Procedure

Before the experiment began, all nine participants arrived at the laboratory and met one another face-to-face. They then had the procedure thoroughly explained, including specifications about the virtual reality equipment as well as the instructions on how to carry out the specific problem-solving tasks.
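The three TSI conditions described above (normal, face similarity, and mimic) can be summarized as a lookup from viewer, target, and condition to what is rendered. This is a minimal Python sketch for illustration, not the study's software: the `Condition` enum, the `rendered_avatar` function, and the dictionary keys are hypothetical names, and participants are identified by the letters used in the text.

```python
from enum import Enum


class Condition(Enum):
    NORMAL = "normal"
    FACE_SIMILARITY = "face similarity"
    MIMIC = "mimic"


def rendered_avatar(viewer: str, target: str, condition: Condition) -> dict:
    """Describe how `viewer` sees the avatar of `target` in a given condition.

    Returns whose face model is shown, whose head movements drive the
    avatar, and the lag (in seconds) applied to those head movements.
    """
    if viewer == target:
        raise ValueError("a participant does not view his or her own avatar")
    # Face similarity: the target's avatar wears the viewer's face.
    face = viewer if condition is Condition.FACE_SIMILARITY else target
    # Mimic: the target's avatar replays the viewer's head movements.
    motion = viewer if condition is Condition.MIMIC else target
    # The four-second lag (from the text) applies only to mimicry.
    lag = 4.0 if condition is Condition.MIMIC else 0.0
    return {"face": face, "head_motion": motion, "lag_seconds": lag}


# From participant C's point of view, looking at participant A:
view_normal = rendered_avatar("C", "A", Condition.NORMAL)
view_face = rendered_avatar("C", "A", Condition.FACE_SIMILARITY)
view_mimic = rendered_avatar("C", "A", Condition.MIMIC)
```

Because the mapping depends on the viewer, the same avatar is rendered differently for each of the other two group members within a single trial.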
Participants were instructed that they would be scored equally for the quantity and accuracy of their responses. Furthermore, they had photographs taken of their faces to allow the construction of realistic avatars for use over the next ten weeks. During this pre-experiment meeting we explicitly described the three different experimental conditions and showed them the CVE. Upon leaving this meeting they were all aware that they would be mimicked in certain trials, that the other group members' avatars would wear the participant's own face in certain trials, and that they could only see one other participant in any given field of view.

For each experimental session, one group of three participants arrived at the laboratory, filled out a consent form, and then each participant sat at an immersive virtual reality station. The stations were in opposite corners of a large room, and participants could not easily see one another if they were to take off their HMDs. First they engaged in a series of problem solving tasks while immersed in virtual reality. Then, they went to a separate room and filled out a number of self report questionnaires.

Problem solving tasks. Once the participants had successfully donned their equipment, each entered a CVE in which the other two participants were sitting around a table. They then began to perform three types of problem solving tasks in an order specified according to the counterbalancing scheme depicted in Appendix A. The tasks were administered verbally by the operator, an experimenter who could be heard via the audio system but had no visual presence inside the CVE. Between sets of tasks, participants took a mandatory three-minute break in which they removed their HMD and audio equipment. A full list of the stimuli appears in Appendix B.

The first task was to perform three iterations of twenty questions, a task used previously to assess performance in CVEs (Bailenson, Beall, & Blascovich, 2002).
During the twenty questions task, the operator would moderate the trials for each designated word. Members of the group would then deliberate about a “yes/no” question to ask that would narrow down the scope of what the word might be. When they agreed upon an appropriate question, one of the members would provide an “official” question, and the operator would answer it. For each iteration, the group had 20 questions or five minutes to determine the word. Corrective feedback is inherent in the twenty questions task, because the operator answers every question.

The second task was to perform three iterations of the reverse remote association task (RAT), also used previously to assess performance in CVEs (Hoyt & Blascovich, 2003). For each iteration, participants received a word from the operator and were instructed to generate three words that were related to that one word. For example, given the word “ball”, participants could generate the words “party”, “bounce”, and “bearing”. Participants were told to keep two goals in mind equally when generating triads: 1) come up with as many triads as possible, and 2) come up with triads that were as creative as possible. We gave them some examples of uncreative ones (e.g., “basket”, “base”, and “foot”) as well as more creative ones, in which individual members of the triad related to the target in different semantic manners. For each iteration, the group had five minutes to generate as many triads as possible, and during each experimental trial, a group received three separate RAT games with different words. There was no corrective feedback given by the operator about their responses.

The third task was to solve a single insight problem. Participants read the problem via three-dimensional text projected inside the virtual world and could have the problem reread by asking the operator. The problems were primarily taken from Weisberg (1995), and were designed to foster collaborative thought in order to produce a solution.
In other words, unlike a “hill-climbing” problem, in which one nears the solution in proportion to the amount of thought dedicated to the problem (e.g., filling in a crossword puzzle), solving an insight problem requires creativity, and regardless of the amount of work performed, one gets no closer to the solution without a flash of insight. Participants worked until they solved the problem or a ten-minute time limit was up. There was no corrective feedback given by the operator about their responses.

Across these three types of tasks, participants spent about 35-40 minutes total immersed during each experimental trial. After completing the third task they removed the HMD, went to a separate room, and filled out a number of self-report questionnaires.

Self report tasks. While in a separate room from the immersive virtual reality equipment, participants sat down at a desk and completed a number of questionnaires. These measures are depicted in full in Appendix C. The first was a four-item Presence questionnaire, designed to measure how immersed participants were in the virtual world, as opposed to the physical world. Cronbach’s alpha, a measure commonly used to assess the reliability of scales, was .75. The second was a five-item Copresence questionnaire, designed to measure how human-like and socially relevant the other avatars in the room were. This questionnaire was used previously in CVE research (Bailenson et al., 2002; Bailenson, Blascovich, Beall, & Loomis, 2003), and in this study Cronbach’s alpha was .85. The third was a ten-item Entitativity questionnaire that measured how cohesive the group of three individuals was. In other words, a tightly knit, well-functioning group would be high in entitativity, while a non-cohesive collection of disjointed individuals would be low. Items from this questionnaire were based on previous work (Maner et al., 2002), and in this study Cronbach’s alpha was .85.
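For readers unfamiliar with the reliability statistic reported above, Cronbach's alpha for a scale of k items is k / (k - 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The following Python sketch computes it with the standard library; the function name `cronbach_alpha` and the example responses are made up for illustration and are not data from this study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Compute Cronbach's alpha.

    `responses` is a list of rows, one per respondent, where each row
    holds that respondent's scores on the k items of the scale.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up example: three respondents answering a three-item scale.
example_alpha = cronbach_alpha([[1, 2, 3], [2, 3, 4], [4, 4, 5]])

# Two items that always agree give an alpha of 1.
perfect_alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```

Values closer to 1 indicate that the items vary together across respondents, which is why alpha is read as a measure of internal consistency.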
The fourth was a sixteen-item simulator sickness questionnaire (Kennedy, Lane, Berbaum, & Lilienthal, 1993). Participants marked the extent to which they were experiencing various aftereffects on a scale from one to four, with higher numbers indicating greater severity. Cronbach’s alpha was .71 for this measure. Finally, after each session, participants wrote an open-ended paragraph describing their experience that day in virtual reality. On average, the total self report session took about fifteen minutes.

Materials and Apparatus. Perspectively correct stereoscopic images were rendered by a 1700 MHz Pentium IV computer with an NVIDIA GeForce FX 5200 or 5950 graphics card, and were updated at an average frame rate of 60 Hz. The simulated viewpoint was continually updated as a function of the participants’ head movements, which were tracked by a three-axis orientation sensing system (Intersense IS250, update rate of 150 Hz). The system latency, or delay between a participant's head movement and the resulting update in the HMD's visual display, was at most 45 ms. The software used to integrate the rendering and tracking was Vizard 2.14. Figure 1 shows a participant wearing the equipment. Participants wore either a Virtual Research V8 stereoscopic head-mounted display (HMD) that featured dual 680 horizontal by 480 vertical pixel resolution panels that refreshed at 60 Hz, or an nVisor SX HMD that featured dual 1280 horizontal by 1024 vertical pixel resolution panels that refreshed at 60 Hz. On both types of HMD, the display optics presented a visual field subtending approximately 50 degrees horizontally by 38 degrees vertically.

Figure 1: A participant wearing the HMD.

Participants wore microphones, and we used custom, real-time audio sampling software to measure the instantaneous speech sound levels captured near each participant’s mouth.
When the amplitude was over a certain threshold, their avatars opened their mouths to indicate speech. We sampled the microphone amplitude at a rate of 20 Hz. Furthermore, each participant had speakers placed near his or her station in order to clearly hear the utterances of the other two group members. The speech was non-spatialized, in that the sound did not emanate from the exact digital space containing the speaking avatar.

Figure 2: A participant's view of another participant's avatar.

Figure 2 demonstrates a sample point of view for a given participant. It was not possible for a participant to see both of the other two participants simultaneously; this setup was chosen to encourage head movements, which were easier to track and record than eye movements given our apparatus. The avatar faces were constructed from a series of photographs with software by 3dMeNow; previous research has indicated that these digital models are extremely high in similarity (both objectively and psychologically) to the faces from which they were modeled (Bailenson, Beall, Blascovich, & Rex, 2004). Figure 3 shows the face models of the nine participants. The only behaviors utilized by avatars were head movements, speech, mouth movements, and blinking (according to a random algorithm based on human blink rates).

Figure 3: Models used as avatars for our nine participants in this study.
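The speech-driven mouth animation described above amounts to thresholding microphone amplitude samples taken at 20 Hz. A minimal Python sketch follows, under stated assumptions: the function name `mouth_states`, the amplitude units, and the threshold value are hypothetical, since the text does not report them; only the 20 Hz sampling rate comes from the text.

```python
def mouth_states(amplitudes, threshold, sample_rate_hz=20.0):
    """Return (time_in_seconds, mouth_open) pairs, one per amplitude sample.

    The mouth is shown open whenever the sampled amplitude exceeds the
    threshold. The threshold value is an assumption of this sketch.
    """
    return [
        (index / sample_rate_hz, amplitude > threshold)
        for index, amplitude in enumerate(amplitudes)
    ]


# Hypothetical amplitude samples on an arbitrary 0-to-1 scale.
states = mouth_states([0.0, 0.2, 0.5, 0.1], threshold=0.15)
open_flags = [is_open for _, is_open in states]
```

At 20 Hz, each sample covers 50 ms, so the avatar's mouth state can change at most twenty times per second.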


Similar resources

Transformed Social Interaction, Augmented Gaze, and Social Influence in Immersive Virtual Environments

Immersive collaborative virtual environments (CVEs) are simulations in which geographically separated individuals interact in a shared, three-dimensional, digital space using immersive virtual environment technology. Unlike videoconference technology, which transmits direct video streams, immersive CVEs accurately track movements of interactants and render them nearly simultaneously (i.e., in r...

Simulation Sickness Comparison between a Limited Field of View Virtual Reality Head Mounted Display (oculus) and a Medium Range Field of View Static Ecological Driving Simulator (eco2)

In this article, an experimental procedure is presented in order to evaluate the role of having HMD oculus and (Eco2 driving simulator) in terms of driving simulation sickness. The driving simulation sickness is investigated with respect to SSQ (simulator sickness questionnaire) and vestibular dynamics (head movements) of the driver participants for a specific driving scenario. The scenario of ...

Assessing Simulator Sickness in a See-through Hmd: Effects of Time Delay, Time on Task, and Task Complexity

Advances in helmet-mounted displays (HMDs) have permitted the design of “see-through” displays in which virtual imagery may be superimposed upon real visual environments. Such displays have numerous potential applications; however, their promise to improve human perception and performance in complex task environments is threatened by numerous technological challenges. Moreover, users of HMDs ma...

Evaluating Control Schemes for the Third Arm of an Avatar

Recent research on immersive virtual environments has shown that users can not only inhabit and identify with novel avatars with novel body extensions, but also learn to control novel appendages in ways beneficial to the task at hand. But how different control schemas might affect task performance and body ownership with novel avatar appendages has yet to be explored. In this article, we discus...

Head-Mounted Displays for Clinical Virtual Reality Applications: Pitfalls in Understanding User Behavior while Using Technology

The use of virtual environments with head-mounted displays (HMDs) offers unique assets to the evaluation and therapy of clinical populations. However, research examining the effects of this technology on clinical populations is sparse. Understanding how wearers interact with the HMD is vital. Discomfort leads to altered use of the HMD that could confound performance measures; the very measures ...

Journal title:
  • Presence

Volume 15  Issue 

Pages  -

Publication date 2006